A self-updating road map of The Cancer Genome Atlas
Abstract
MOTIVATION: Since 2011, The Cancer Genome Atlas' (TCGA) files have been accessible through HTTP from a public site, creating entirely new possibilities for cancer informatics by enhancing data discovery and retrieval. Significantly, these enhancements enable the reporting of analysis results that can be fully traced to and reproduced using their source data. However, to realize this possibility, a continually updated road map of files in the TCGA is required. Creation of such a road map represents a significant data modeling challenge, due to the size and fluidity of this resource: each of the 33 cancer types is instantiated in only partially overlapping sets of analytical platforms, while the number of data files available doubles approximately every 7 months.

RESULTS: We developed an engine to index and annotate the TCGA files, relying exclusively on third-generation web technologies (Web 3.0). Specifically, this engine uses JavaScript in conjunction with the World Wide Web Consortium's (W3C) Resource Description Framework (RDF), and SPARQL, the query language for RDF, to capture metadata of files in the TCGA open-access HTTP directory. The resulting index may be queried using SPARQL, and enables file-level provenance annotations as well as discovery of arbitrary subsets of files, based on their metadata, using web standard languages. In turn, these abilities enhance the reproducibility and distribution of novel results delivered as elements of a web-based computational ecosystem. The development of the TCGA Roadmap engine was found to provide specific clues about how biomedical big data initiatives should be exposed as public resources for exploratory analysis, data mining and reproducible research. These specific design elements align with the concept of knowledge reengineering and represent a sharp departure from top-down approaches in grid initiatives such as caBIG. They also present a much more interoperable and reproducible alternative to the still pervasive use of data portals.

AVAILABILITY: A prepared dashboard, including links to source code and a SPARQL endpoint, is available at http://bit.ly/TCGARoadmap. A video tutorial is available at http://bit.ly/TCGARoadmapTutorial.

CONTACT: [email protected].
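As a rough illustration of the indexing idea described under RESULTS, the sketch below turns file URLs into subject-predicate-object triples and then selects a subset of files by their metadata. It is not the authors' engine, which the abstract says is written in JavaScript against RDF and SPARQL. The directory layout, the example URLs, the `ex:` predicate names and the helper functions are all assumptions made for this example, and the SPARQL string is shown only to suggest what a comparable query might look like.

```python
from urllib.parse import urlparse

# Hypothetical predicate names; the abstract does not give the real vocabulary.
DISEASE = "ex:diseaseStudy"
CENTER_TYPE = "ex:centerType"
PLATFORM = "ex:platform"

# Illustrative only: a SPARQL query one might send to an endpoint that
# exposed the hypothetical predicates above. It is not executed here.
EXAMPLE_SPARQL = """
PREFIX ex: <http://example.org/roadmap#>
SELECT ?file WHERE {
  ?file ex:diseaseStudy "gbm" ;
        ex:platform "agilent-expression" .
}
"""


def file_to_triples(url):
    """Turn one file URL into (subject, predicate, object) triples.

    Assumes a made-up layout: .../<disease>/<center_type>/<platform>/<file>
    """
    parts = urlparse(url).path.strip("/").split("/")
    disease, center_type, platform, _name = parts[-4:]
    return [
        (url, DISEASE, disease),
        (url, CENTER_TYPE, center_type),
        (url, PLATFORM, platform),
    ]


def build_index(urls):
    """Collect the triples for every URL into one flat list (the 'index')."""
    triples = []
    for url in urls:
        triples.extend(file_to_triples(url))
    return triples


def select_files(triples, criteria):
    """Return the sorted URLs whose metadata match every predicate/value pair."""
    by_subject = {}
    for subject, predicate, obj in triples:
        by_subject.setdefault(subject, {})[predicate] = obj
    return sorted(
        subject
        for subject, meta in by_subject.items()
        if all(meta.get(p) == v for p, v in criteria.items())
    )


# Made-up URLs standing in for entries of an open-access HTTP directory.
URLS = [
    "https://example.org/tcga/gbm/cgcc/agilent-expression/a.txt",
    "https://example.org/tcga/gbm/gsc/illumina-seq/b.txt",
    "https://example.org/tcga/ov/cgcc/agilent-expression/c.txt",
]

INDEX = build_index(URLS)
GBM_AGILENT = select_files(INDEX, {DISEASE: "gbm", PLATFORM: "agilent-expression"})
```

Because every file is identified by its URL, a result reported from such a subset can carry the exact list of source files it was computed from, which is the file-level provenance the abstract refers to.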
Similar resources
Research on Updating of Urban Large Scale Road Map Based on High Resolution Remote Sensing Image
Roads are among the most important features of an urban area, and changes in the road network reflect the pace of urban construction. Updating road maps promptly and accurately has therefore become an urgent issue. Current road feature updating is carried out entirely by hand, so it tends to miss changed roads and offers a low level of automation. The updating of map road features includes two parts: one ...
Classification of Streaming Fuzzy DEA Using Self-Organizing Map
The classification of fuzzy data is considered one of the most challenging areas of data analysis, and the complexity of the procedures has been an obstacle to the development of new methods for fuzzy data analysis. However, there have been significant advances in modeling systems in which fuzzy data are available in the field of mathematical programming. In order to exploit the results of the researches on ...
Spatio-temporal Modeling in Road Network Change Detection and Updating
Automatic road map updating has been one of the difficult and important research topics in the community of geomatics. An operational road map updating system should include three key components: the generation of a new version of roads, the automatic road change detection/updating and the spatio-temporal modelling of road data. Special considerations have to be given to the spatio-temporal mod...
A Framework for Road Change Detection and Map Updating
The updating of road network databases is crucial to many Geographic Information System (GIS) applications such as navigation, urban planning, etc. This paper presents a comprehensive framework for image-based road network updating, in which the following three tasks are performed sequentially: road extraction from imagery, road change detection and updating, and spatio-temporal modeling. For r...
Road Data Updating Using Tools of Matching and Map Generalization
Map generalization is one of the important ways of updating GIS data. This paper analyzes the main steps of road data updating based on map generalization. As the core of this updating process, a matching method that considers level analysis and a selective omission method based on mesh density are developed. The approach for road data updating based on these two tools is proposed, which i...